Towards English-to-Czech MT via Tectogrammatical Layer

نویسندگان

  • Ondrej Bojar
  • Silvie Cinková
  • Jan Ptácek
چکیده

We present an overview of an English-to-Czech machine translation system. The system relies on transfer at the tectogrammatical (deep syntactic) layer of the language description. We report on the progress of linguistic annotation of English tectogrammatical layer and also on first end-to-end evaluation of our syntax-based MT system.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Automatic Alignment of Czech and English Deep Syntactic Dependency Trees⋆

In this paper, we focus on alignment of Czech and English tectogrammatical dependency trees. The alignment of deep syntactic dependency trees can be used for training transfer models for machine translation systems based on analysis-transfer-synthesis architecture. The results of our experiments show that shifting the alignment task from the word layer to the tectogrammatical layer both (a) inc...

متن کامل

Czech-English Dependency-based Machine Translation

We present some preliminary results of a Czech-English translation system based on dependency trees. The fully automated process includes: morphological tagging, analytical and tectogrammatical parsing of Czech, tectogrammatical transfer based on lexical substitution using word-to-word translation dictionaries enhanced by the information from the English-Czech parallel corpus of WSJ, and a simp...

متن کامل

Czech-English Dependency Tree-based Machine Translation

We present some preliminary results of a Czech-English translation system based on dependency trees. The fully automated process includes: morphological tagging, analytical and tectogrammatical parsing of Czech, tectogrammatical transfer based on lexical substitution using word-to-word translation dictionaries enhanced by the information from the English-Czech parallel corpus of WSJ, and a simp...

متن کامل

Improving English-Czech Tectogrammatical MT

The present paper summarizes our recent results concerning English-CzechMachine Translation implemented in the TectoMT framework. The system uses tectogrammatical trees as the transfer medium. A detailed analysis of errors made by the previous version of the system (considered as the baseline) is presented first. Then several improvements of the system are described that led to better translati...

متن کامل

Announcing Prague Czech-English Dependency Treebank 2.0

We introduce a substantial update of the Prague Czech-English Dependency Treebank, a parallel corpus manually annotated at the deep syntactic layer of linguistic representation. The English part consists of the Wall Street Journal (WSJ) section of the Penn Treebank. The Czech part was translated from the English source sentence by sentence. This paper gives a high level overview of the underlyi...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Prague Bull. Math. Linguistics

دوره 90  شماره 

صفحات  -

تاریخ انتشار 2008